Structured Solution Methods for Non-Markovian Decision Processes

Authors

  • Fahiem Bacchus
  • Craig Boutilier
  • Adam J. Grove
Abstract

Markov Decision Processes (MDPs), currently a popular method for modeling and solving decision-theoretic planning problems, are limited by the Markovian assumption: rewards and dynamics depend only on the current state, not on previous history. Non-Markovian decision processes (NMDPs) can also be defined, but the more tractable solution techniques developed for MDPs cannot be applied to them directly. In this paper, we show how an NMDP, in which temporal logic is used to specify history dependence, can be automatically converted into an equivalent MDP by adding appropriate temporal variables. The resulting MDP can be represented in a structured fashion and solved using structured policy construction methods. In many cases, this offers significant computational advantages over previous proposals for solving NMDPs.
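The core idea of the translation can be illustrated with a toy example (a hypothetical sketch, not the paper's actual construction): a reward of the form "pay off once proposition p has held at some point in the past" is non-Markovian over the base state, but becomes Markovian once the state is augmented with a temporal variable recording whether p has ever been true.

```python
# Hypothetical sketch of the state-augmentation idea.
# The names (`seen_p`, `step_temporal`) are illustrative, not from the paper.

def step_temporal(seen_p, p_holds_now):
    """Update rule for the temporal variable "p has held at some point"."""
    return seen_p or p_holds_now

def reward(base_state, seen_p):
    """Markovian over the augmented state (base_state, seen_p):
    no reference to the history trajectory is needed."""
    return 1.0 if seen_p else 0.0

# Simulate a trajectory in which p first holds at step 2.
p_trace = [False, False, True, False]
seen = False
rewards = []
for p in p_trace:
    seen = step_temporal(seen, p)
    rewards.append(reward("s", seen))

print(rewards)  # [0.0, 0.0, 1.0, 1.0]
```

Once the temporal variable is folded into the state, any standard (structured) MDP solution method can be applied to the augmented process.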

Related papers

Properties of Planning with Non-Markovian Rewards

We examine technologies designed to solve decision processes with non-Markovian rewards (NMRDPs). More specifically, target decision processes exhibit Markovian dynamics, called grounded dynamics, and desirable behaviours are modelled as state trajectories specified in a temporal logic. Each technology operates by automatically translating NMRDPs into corresponding equivalent MDPs amenable to c...



Decision-Theoretic Planning with non-Markovian Rewards

A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovi...


Implementation and Comparison of Solution Methods for Decision Processes with Non-Markovian Rewards

This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to well-known MDP solution methods. They differ however in the representation of the target MDP and the class o...


Anytime State-Based Solution Methods for Decision Processes with non-Markovian Rewards

A popular approach to solving a decision process with non-Markovian rewards (NMRDP) is to exploit a compact representation of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to our favorite MDP solution method. The contribution of this paper is a representation of non-Markovian reward functions and a translation into MDP aimed a...



Publication date: 1997